ABSTRACT

When their most important recommendations are applied, RDA guidelines lead to a semantic web oriented
catalog. Both authority and bibliographic records need to be curated, especially with regard to persistent
identifiers that connect entities to relevant external resources, thus improving the navigability of the
retrieved information and, ultimately, of the whole catalog.

To make substantial progress toward these goals, established records also need to be provided with links,
URIs, or persistent identifiers. An automatic approach was studied and applied in two phases to the
authority records of a Koha catalog.

A new tool was created and added to the detailed display of a bibliographic record in order to show
valuable information for each agent having a responsibility.
1. Introduction

In 2014, the Library of the Pontificia Università della Santa Croce started working on authority records
in order to link them to records of other institutions or projects, especially to the Virtual International
Authority File (VIAF).[1] This decision was a consequence of (1) the migration to the open source
integrated library system Koha[2] in 2011 and (2) the introduction of Resource Description and Access
(RDA) in the URBE Network.[3] Authority records were prepared in a semiautomated way when
imported into Koha, and new records were registered following the Anglo-American Cataloguing Rules,
2nd edition (AACR2) and then RDA. RDA was officially adopted by URBE in March 2017. Right
after that, URBE became a member of the European RDA Interest Group (EURIG).[4]

The cataloguing staff of the Pontificia Università della Santa Croce noticed the importance of
comparing records with other sources, especially studying how they were registered by national
libraries and used in their websites. For this reason we decided to store identifiers using tag 024
(MARC 21). Our choice was to add VIAF and ISNI (International Standard Name Identifier)[5]
identifiers to our records, following the practice of some of the most important cataloging agencies.
VIAF identifiers were chosen for the prestige of the VIAF project, ISNI identifiers for the wide range
of contributions and for compliance with the International Organization for Standardization (ISO).

2. Adding persistent identifiers 

The manual process of adding identifiers to authority records can be tedious and can slow down the
cataloguing process. The steps include accessing one or more websites, pasting the name to search,
comparing results with data available in our library, modifying the search if not successful, and
copying the identifier. Often, other important data are manually gathered and recorded, such as dates,
gender, languages used by the author, and alternative names.

To address this complexity, we studied and applied to Koha a way to search, retrieve and save valuable
data programmatically. A Koha extension was written and applied to the cataloguing interface, both in
the record view page and in the update page. On loading these pages, the browser accesses the ISNI
search application programming interface (API)[6] using the personal or corporate name of the record.
In a floating window, results are compared with our record, highlighting matches based on the
International Standard Book Number (ISBN). If no ISBNs match, the cataloguer can still visually
detect a match through two lists of titles, one from ISNI and one from our catalog.
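The ISBN comparison described above can be sketched as follows. This is a minimal illustration in Python, not the Koha extension itself (which runs in the browser); the function names `normalize_isbn` and `shared_isbns` are hypothetical, and the sketch assumes that ISBN-10 and ISBN-13 forms of the same number should count as a match.

```python
import re


def normalize_isbn(raw):
    """Return a 13-digit ISBN string, or None if the input is not ISBN-shaped.

    Check digits of the input are not validated; an ISBN-10 is only
    re-expressed as the corresponding 978-prefixed ISBN-13.
    """
    s = re.sub(r"[^0-9Xx]", "", raw).upper()  # drop hyphens, spaces, labels
    if len(s) == 13 and s.isdigit():
        return s
    if len(s) == 10 and s[:9].isdigit():
        core = "978" + s[:9]
        # ISBN-13 check digit: weights 1,3,1,3,... over the first 12 digits
        total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(core))
        return core + str((10 - total % 10) % 10)
    return None


def shared_isbns(local_isbns, remote_isbns):
    """Return the normalized ISBNs present in both lists."""
    local = {normalize_isbn(i) for i in local_isbns}
    remote = {normalize_isbn(i) for i in remote_isbns}
    return (local & remote) - {None}
```

A non-empty result from `shared_isbns` would correspond to a highlighted match; an empty result leaves the decision to the cataloguer, who compares the two lists of titles.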


[1] https://viaf.org . VIAF gathers the authority files of national libraries and other projects, and merges them into a cluster
record that brings together the different names used worldwide for the same entity.

[2] https://koha-community.org . Koha is a Maori name for gift, and not an acronym. It is a 20-year-old open source ILS in
widespread use all over the world.

[3] URBE (Unione Romana Biblioteche Ecclesiastiche, http://www.urbe.it ) is a network of 18 academic institutions. The
Pontificia Università della Santa Croce belongs to URBE.

[4] http://www.rda-rsc.org/europe .

[5] http://www.isni.org/ . For a discussion on ISNI, see MacEwan, Angjeli, Gatenby 2013.

[6] http://isni.oclc.nl/sru/ .
If tag 024 has no occurrences, a button allows the cataloguer to capture both ISNI and VIAF identifiers
and record them in two new occurrences, in compliance with MARC 21 rules:[7]

024 7# $2 viaf $a viaf identifier

024 7# $2 isni $a isni identifier without spaces

The first indicator, set to the value “7”, states that the source is specified in subfield “2” with a coded
value listed in “Standard Identifier Source Codes”.[8]

We chose to store in subfield “a” the basic information, i.e. the identifiers instead of the Uniform
Resource Identifiers (URI), leaving it to other applications to rebuild URIs, or to build links and
services as simply as possible, as described later.
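As an illustration of how an application might rebuild URIs from the bare identifiers stored in 024 $a, here is a minimal Python sketch. The URI templates and the function names are assumptions made for this example; they are not taken from the Koha extension.

```python
# Assumed URI patterns, keyed by the source code stored in 024 $2.
URI_TEMPLATES = {
    "viaf": "https://viaf.org/viaf/{id}",
    "isni": "https://isni.org/isni/{id}",
}


def field_024(source, identifier):
    """Render an identifier in the display form used above (024 7# $2 ... $a ...)."""
    return "024 7# $2 {} $a {}".format(source, identifier.replace(" ", ""))


def uri_from_024(source, identifier):
    """Rebuild a URI from a source code and a bare identifier, or return None."""
    template = URI_TEMPLATES.get(source)
    if template is None:
        return None  # no known pattern for this source code
    return template.format(id=identifier.replace(" ", ""))
```

Keeping only the identifier in the record means that a change in a provider's URL scheme requires editing one template rather than every authority record.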

Thanks to this approach, in about five years 14,100 authority records were supplied with identifiers,
covering 78,500 connected bibliographic records out of a total of about 174,300 (45%). This percentage
increases sharply at the end of Phase II, as shown below in Section 9.

3. Working on backlogs 

The previous task was also performed manually on some established authority records, especially for
famous authors and authors related to our University or to the URBE network, in an effort to add
information that is often within easy reach. At the same time, these records were declared RDA
compliant[9] if tags 024, 046 and 670 were present.

However, the ordinary work of cataloguing does not allow a pace sufficient to enrich the full authority
catalog. This is why we designed and implemented a specific batch procedure. We based this task on our
experience with bibliographic records. In past years, in fact, we performed two other automatic
enrichments, the first one to add Dewey codes (Bargioni et al. 2013) and the second one, following
RDA recommendations once again, to add relationship codes and designators (Bargioni 2016) to
authors, in subfields 1xx$e and 7xx$e[10] of bibliographic records.

Both enrichments boosted this kind of information in our records. Since then, cataloging activity has
included their use, because the staff, who directly participated in the processes, is trained to do it.

We consider this boost effect very valuable, especially for authority records. Authority work can
sometimes represent extra work for cataloguers, especially if an authority department cannot be
established. Moreover, it also allows new services for end users to be introduced, such as the
AuthorityBox, discussed in Section 10.

4. Enrichment of records without VIAFid (phase I) 

Batch reconciliation of local data with data from external sources can be defined as a process of
ensuring that two sets of records are in agreement. It can be performed in different ways, from online
matches to offline comparisons with other datasets. The record set to reconcile with an external
dataset usually requires fuzzy matching to match entries; the reconciliation is successful when a
unique ID from the external dataset is retrieved and stored in the local record.

[7] https://www.loc.gov/marc/authority/ad024.html .

[8] https://www.loc.gov/standards/sourcelist/standard-identifier.html .

[9] This compliance is stated in tag 040 subfield “e” set to “rda”.

[10] For 111 and 711, $j is the subfield to use.

We studied four approaches: (1) using OpenRefine,[11] (2) using a fully online query of VIAF data, (3)
a fully offline procedure, and (4) a mix of offline and online.

The procedure (1) based on OpenRefine can be summarized in these steps: extract records or their
valuable data, like heading and local identifier; fill in an OpenRefine project; use the VIAF reconcile
function[12] on the heading column; manually resolve uncertain matches in the new column generated
by the function; add tag 024 to MARC records and update them in the catalog.

OpenRefine reconciliation services usually return a group of candidate names for each name, which
can require manual fixes.

The online approach (2)[13] consists of querying the VIAF autosuggest service, keeping similar names,
and trying to detect a match based on ISBNs. If a match is satisfactory, the local record can be enriched
and saved. This method continuously connects to the VIAF service, at a deliberately slow pace so as to
access it gently. It can be performed using only one name at a time, i.e. a local name like “Iohannes
Paulus PP. II, s., 1920-2005.” might have no match in VIAF, while one of its alternate names
available in our record could (e.g.: John Paul II, Pope, 1920-2005; Jan Pawel II, papiez, 1920-2005;
Giovanni Paolo II, papa, 1920-2005; Wojtyla, Karol, 1920-2005; ...). This is due to local variants,
applied to ancient and religious authors, following a cataloging principle adopted by the URBE
network since 2009, in compliance with AACR2 and in partial accordance with the Vatican Library.

Thus we studied the offline approach (3), to avoid a high number of connections to the VIAF service.
VIAF offers its large datasets in a variety of formats in the Data Source page.[14] The dataset
containing all clusters[15] is a very large file, especially when it is expanded, loaded and indexed in a
database system: too much for our organization.

This is why we chose and applied the partially offline solution (4): we stored locally only headings,
ISBNs and dates from the dataset of VIAF clusters, to perform matches offline and retrieve the full
cluster information only when a match is detected. This solution also allowed us to work on authority
records one at a time, and to run the batch process during the daily cataloguing work and the online
public access to the system. Moreover, we avoided extracting, updating, importing and reindexing a
large number of authority records, a procedure that requires a period of inactivity of the system.



[11] http://openrefine.org .

[12] OpenRefine, starting from version 2.8, comes with two preinstalled reconciliation services, VIAF and Wikidata. A tutorial,
https://github.com/OpenRefine/OpenRefine/wiki/Reconciliation-Service-API , explains how to develop reconciliation services
for other sources; https://www.wikidata.org/wiki/Wikidata:Tools/OpenRefine illustrates how to use OpenRefine for Wikidata
reconciliation.

[13] Two examples of this method are: https://gist.github.com/nichtich/832052 by J. Voss (2010), and
http://infomotions.com/blog/2016/05/viaf-finder/ by E. L. Morgan (2016). Both accessed September 10, 2019.

[14] http://viaf.org/viaf/data .

[15] http://viaf.org/viaf/data/viaf-20190203-clusters-marc21.iso.gz . VIAF regenerates the datasets regularly, on a monthly
basis, so this link is no longer valid.
5. The reconciliation process in detail

The VIAF dataset of clusters was downloaded[16] and expanded. From a download size of 10.39
gigabytes, the resulting MARCXML records occupied 33.46 gigabytes.

A Perl script[17] was written to extract VIAF identifiers, names, ISBNs and dates.[18] A SQLite[19]
database was defined with these columns and filled with these data, on the same server as Koha. The
database consisted of 59,168,458 rows. Its first two columns were indexed, raising the size of the
database to 5.1 gigabytes and forming something similar to a knowledge base. Despite its size, SQLite
always performed very quickly, which spared us from setting up a dedicated DBMS server for this
time-limited task.
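A lookup table of this kind can be sketched in a few lines. The authors' scripts are in Perl; the Python version below is only an illustration, and the table name `viaf_lookup`, the column names and the sample rows in the usage note are assumptions.

```python
import sqlite3


def build_lookup(rows, path=":memory:"):
    """Create the four-column lookup table and index its first two columns.

    rows: iterable of (viaf_id, name, isbn, dates) tuples.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS viaf_lookup "
        "(viaf_id TEXT, name TEXT, isbn TEXT, dates TEXT)"
    )
    conn.executemany("INSERT INTO viaf_lookup VALUES (?, ?, ?, ?)", rows)
    # Index the first two columns, as described in the text.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_viaf_id ON viaf_lookup (viaf_id)")
    conn.execute("CREATE INDEX IF NOT EXISTS idx_name ON viaf_lookup (name)")
    conn.commit()
    return conn


def candidates_for(conn, name):
    """Return (viaf_id, isbn, dates) rows whose name equals the given heading."""
    cur = conn.execute(
        "SELECT viaf_id, isbn, dates FROM viaf_lookup WHERE name = ?", (name,)
    )
    return cur.fetchall()
```

For example, `build_lookup([("111", "Doe, Jane", "9780306406157", "1900-1980")])` followed by `candidates_for(conn, "Doe, Jane")` returns the single matching row. Because SQLite is embedded, no separate database server has to be installed for the duration of the batch job.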

At the end of the preparation phase, a Perl script was written to run in Koha. Its task can be summarized
in the following steps:

- from the Koha catalog, select authority records to be enriched, i.e. lacking tag 024
- for each selected authority:
    - extract rows from the SQLite database, starting from names in tag 100$a and, if needed,
      from 400$a (alternative names)
    - retrieve their dates from tag 046 or 100$d
    - if a match on dates occurs, enrich the selected authority
    - otherwise, retrieve associated ISBNs from bibliographic records
    - if a match on ISBNs occurs, enrich the selected authority
    - log the results of the operation for tracing and statistical purposes
- end of the process.
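The two-level decision in the steps above (dates first, then ISBNs) can be sketched as follows. This is a simplified Python illustration, not the authors' Perl script: the function name `pick_viaf_id` and the shape of the candidate tuples are assumptions, dates are compared as plain strings, and the first matching candidate wins.

```python
def pick_viaf_id(candidates, local_dates, local_isbns):
    """Choose a VIAF id for one authority record.

    candidates: list of (viaf_id, dates, isbns) tuples found for the
                names of the record (100$a and, if needed, 400$a).
    Returns (viaf_id, "date"), (viaf_id, "isbn") or (None, None).
    """
    # First level: a match on dates.
    if local_dates:
        for viaf_id, dates, _isbns in candidates:
            if dates and dates == local_dates:
                return viaf_id, "date"
    # Second level: at least one ISBN in common.
    wanted = set(local_isbns)
    for viaf_id, _dates, isbns in candidates:
        if wanted & set(isbns):
            return viaf_id, "isbn"
    # No third level (titles): the record is left unreconciled.
    return None, None
```

The second element of the result records which level produced the match, which is the kind of information needed for the log and for statistics by match type.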

“Enrich the selected authority” is the core task. It consists of accessing the VIAF cluster through the
VIAFid detected in the match, retrieving the cluster XML record,[20] extracting valuable data and
updating the local authority record.

As valuable data we selected:

- persistent identifiers other than the VIAFid, like lccn, wikidata, isni, idref, uri (dnb or bnf),
  bnfcg, datoses, vatlib, nukat,[21] recorded in tag 024;
- dates of birth and death, in tag 046;
- gender, in tag 375;
- languages, in tag 377;
- links to Wikipedia pages, in tag 856.


[16] VIAF offers its clusters in a variety of formats, as well as other datasets, updated about once a month.

[17] We adopted Perl for this project because Koha and its libraries are written in this language. Perl libraries used: DBI,
XML::LibXML, Business::ISBN, MARC::Record, JSON, List::MoreUtils, and C4::AuthoritiesMarc to read and write
authority records in the Koha environment.

[18] While ISBN-based matches can be intuitive, libraries can also use date matches to identify people. For a discussion, see
Toves and Hickey 2014.

[19] SQLite, https://www.sqlite.org , as described in its Wikipedia page https://en.wikipedia.org/wiki/SQLite , “is a relational
database management system (RDBMS) contained in a C library. In contrast to many other database management systems,
SQLite is not a client-server database engine. Rather, it is embedded into the end program”, or it can be used standalone.

[20] A VIAF cluster can be downloaded using the URL https://viaf.org/viaf/VIAFid/viaf.xml .

[21] These source codes of national libraries and international projects were chosen as a base for new services, usually links,
that we could add to the OPAC in the future, and especially to greatly enrich our records if published in a LOD format.



JLIS.it 11, 1 (January 2020) 

ISSN: 2038-1026 online 

Open access article licensed under CC-BY 

DOI: 10.4403/jlis.it-12595 



A private note was also added in tag 667 to log the update.

This reconciliation process is quite similar to others applied elsewhere, like the Share-VDE project,[22]
which also works on a knowledge base. It is also quite similar to the process described, for instance, by
Manzotti 2010 to explain the algorithm used by OCLC to group authority records in VIAF clusters.
Our approach tried to minimize the duration of the process and its impact on our server and on the
VIAF server, and to maximize the accuracy of each match. This is why we preferred to avoid a third
level of match, i.e. a match on titles, whose results are hardly ever exact and would thus require a
similarity threshold that could always be questioned.

6. Some statistics of the reconciliation process

| measure | records | % |
|---|---|---|
| clusters in the VIAF db | 26,064,385 | |
| authority records to reconcile | 74,931 | |
| records not found in VIAF db | 10,942 | 14.60% |
| records not reconciled | 30,710 | 40.98% |
| records reconciled | 33,279 | 44.41% |
| records reconciled through a date match | 576 | 1.73% |
| records reconciled through an ISBN match | 32,703 | 98.27% |

Table 1. Reconciliation process statistics



Table 1 shows that a large number of authority records were enriched, most of them through an ISBN
match. Records not found are records that are available only in our catalog, mostly because of
publications related to master's degrees at our University and other URBE institutions.
Regarding the quality of the match process, we consider it very successful. Very few records
were linked to a wrong cluster, and we usually noticed that the error depended on a difficulty also
faced by VIAF in its clustering process, due to very common names and surnames,
especially of Spanish or English/USA authors.
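The percentages in Table 1 use two different bases, which a short calculation makes explicit: the first three refer to the 74,931 records to reconcile, the last two to the 33,279 reconciled records. The Python lines below only re-derive the published figures; the helper `pct` is a name chosen for this example.

```python
# Figures copied from Table 1.
to_reconcile = 74_931
not_found, not_reconciled, reconciled = 10_942, 30_710, 33_279
by_date, by_isbn = 576, 32_703


def pct(part, whole):
    """Percentage of part over whole, rounded to two decimals."""
    return round(100 * part / whole, 2)


# The three outcomes partition the records to reconcile.
assert not_found + not_reconciled + reconciled == to_reconcile
# The two match types partition the reconciled records.
assert by_date + by_isbn == reconciled

outcome_shares = [pct(n, to_reconcile) for n in (not_found, not_reconciled, reconciled)]
match_shares = [pct(n, reconciled) for n in (by_date, by_isbn)]
```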

As for persistent identifiers, each recorded in a separate occurrence of tag 024, we observed this
distribution (Table 2):

| sources (024 $2) | occurrences | % | occ/rec |
|---|---|---|---|
| uri (dnb or bnf) | 40,944 | 19.37% | 1.23 |
| viaf | 33,279 | 15.74% | 1.00 |
| isni | 30,623 | 14.48% | 0.92 |
| lccn | 29,825 | 14.11% | 0.90 |
| idref | 26,001 | 12.30% | 0.78 |
| nukat | 19,799 | 9.36% | 0.60 |
| wikidata | 11,953 | 5.65% | 0.36 |
| vatlib | 10,746 | 5.08% | 0.32 |
| datoses | 6,958 | 3.29% | 0.21 |
| bnfcg | 1,304 | 0.62% | 0.04 |
| TOT occurrences | 211,432 | | |

Table 2. Distribution of sources of new persistent identifiers

[22] http://www.share-vde.org/sharevde/clusters?l=en .



Of course, the VIAFid was always added, while ISNI is the second most important identifier.
Note that for tags other than tag 024, we did not add the information if it was already present. This
means that the following figures and statistics refer to new tags.

Figures about dates in tag 046 are shown in Table 3.

| | records |
|---|---|
| Dates added (tag 046) | 26,910 |
| Dates of birth added (046$f) | 26,742 |
| Dates of death added (046$g) | 7,863 |

Table 3. Dates of birth (subfield f) and death (subfield g)

Tag 046 has the form:

046 ## $2 iso8601 $f yyyy[mm[dd]] $g yyyy[mm[dd]]

where the month and day can be omitted if unknown. The varying precision of dates is not the main
issue with this group of data: despite the effort of the date-processing procedure used by VIAF[23] to
assign birth and death dates to a cluster, they are questionable and problematic, especially for
uncertain dates of ancient or little-known authors. Each cluster in VIAF representing a person has a
date range stored as two dates (min and max), each consisting of a year, month and day. Together with
the date range there is an indication of what the dates represent: lived (the dates are birth and/or death
dates), flourished (the dates show when the person was active), circa (the dates are approximate). Even
though we limited the import to type lived, we detected some issues that call for special attention, up
to rejecting this information. Here are some examples of wrong lived dates (as of September 2, 2019):

| name | VIAF | birth date | death date |
|---|---|---|---|
| Aubert, Paul | 272049937 | 19480704 | 1950 |
| Ross, Alex | 73992084 | 19700122 | 1950 |
| Moltmann, Jürgen | 108285879 | 1926 | 1950, but he still lives |
| Albert von Siegburg | 68169738 | 1456 | 1456 |

Table 4. Examples of wrong dates in VIAF clusters (as of September 10, 2019)

[23] Date management in VIAF is widely described in Toves and Hickey 2014.
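Some of the problems in Table 4 could be flagged automatically before import. The following Python sketch parses the yyyy[mm[dd]] form of tag 046 and applies a simple plausibility check on the lifespan. It is an illustration only: the function names and the age thresholds (15 and 110 years) are assumptions, not rules used in the project.

```python
def parse_046(value):
    """Parse 'yyyy', 'yyyymm' or 'yyyymmdd' into a (year, month, day) tuple.

    Missing month or day are returned as None.
    """
    if not value.isdigit() or len(value) not in (4, 6, 8):
        raise ValueError("unexpected date form: {!r}".format(value))
    year = int(value[:4])
    month = int(value[4:6]) if len(value) >= 6 else None
    day = int(value[6:8]) if len(value) == 8 else None
    return year, month, day


def suspicious_lifespan(birth, death, min_age=15, max_age=110):
    """Return True when the years of birth and death look implausible.

    Only years are compared; thresholds are arbitrary assumptions.
    """
    age = parse_046(death)[0] - parse_046(birth)[0]
    return age < min_age or age > max_age
```

With these thresholds the rows for Aubert, Ross and Albert von Siegburg are flagged, while the Moltmann row (1926 to 1950) is not: a wrong but plausible death date cannot be detected by arithmetic alone and still needs a cataloguer's check.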


Figures about gender are listed in Table 5.

| Tag 375, subfield a | records | % |
|---|---|---|
| total | 24,986 | |
| uomo (male) | 20,930 | 83.77% |
| donna (female) | 4,056 | 16.23% |

Table 5. Gender tag 375 added, by value



Figures about languages are listed in Table 6. They were added to 24,565 records and, of course,
more than one occurrence could be added to a record.

| language code | occurrences | % |
|---|---|---|
| eng | 8,634 | 33.43% |
| ita | 4,809 | 18.62% |
| ger | 3,565 | 13.80% |
| fre | 3,502 | 13.56% |
| spa | 2,874 | 11.13% |
| lat | 587 | 2.27% |
| dut | 300 | 1.16% |
| heb | 179 | 0.69% |
| gre | 165 | 0.64% |
| pol | 143 | 0.55% |
| other languages | 1,066 | 4.13% |
| Total number of occurrences | 25,824 | |

Table 6. Languages added to tag 377
Tag 377 has the form:

377 ## $2 iso639-2 $a language code $l language term in Italian

Table 7 lists the number of links to Wikipedia added to tag 856, as in

856 4# $u https://en.wikipedia.org/wiki/Pope_John_Paul_II

A total of 20,618 occurrences were added to 10,101 records.


| language | site | links | % |
|---|---|---|---|
| English | en.wikipedia.org | 5,711 | 27.70% |
| French | fr.wikipedia.org | 3,534 | 17.14% |
| Italian | it.wikipedia.org | 3,076 | 14.92% |
| German | de.wikipedia.org | 5,480 | 26.58% |
| Spanish | es.wikipedia.org | 2,817 | 13.66% |

Table 7. Links to Wikipedia pages

We chose to limit the links to the languages mainly used in our University, since VIAF clusters usually
contain links to Wikipedia in many other languages.

7. Enrichment of records with VIAFid (phase II) 

Records with the VIAFid in 024 were registered during the past years without systematically adding
other information. The missing information, and more, can be retrieved from the corresponding VIAF
cluster. The process described in Section 4 was performed through a Perl script based on a software
library whose objects and methods were reused for this second phase of the project.

A total of 15,274 records to enrich were selected by searching for the presence of tag 024 containing a
VIAFid, excluding of course those involved in the previous enrichment phase. For each of them, the
corresponding VIAF cluster was downloaded in XML format.[24] Tags were added if not present, or
enriched when already present:

| tag | content |
|---|---|
| 024 | Persistent identifiers other than VIAFid |
| 040 | Description conventions |
| 046 | Dates of birth and death, including month and day when available |
| 375 | Gender |
| 377 | Languages used by the author |
| 856 | External links to some Wikipedia pages |

Table 8. Tags added to records with VIAFid

[24] Again using the URL https://viaf.org/viaf/VIAFid/viaf.xml .

Tag 040 was present in all modified records, stating the original cataloging agency, the language of
cataloging, and the transcribing agency; subfield “e”, which declares the description conventions,[25]
was set to the value “rda” only if at least fields 024, 046 and 670 were present after the enrichment phase.
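The rule for declaring a record RDA compliant is a simple set inclusion, sketched below in Python for illustration. The function name and the representation of a record as a collection of tag strings are assumptions.

```python
# Tags that must all be present before 040 $e is set to "rda".
REQUIRED_FOR_RDA = {"024", "046", "670"}


def description_convention(tags_present):
    """Return "rda" if every required tag is present, otherwise None."""
    if REQUIRED_FOR_RDA <= set(tags_present):
        return "rda"
    return None
```

A record with tags 024, 040, 046, 100 and 670 therefore qualifies, while one that gained 024 and 046 during the enrichment but has no 670 keeps its previous 040 $e.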

8. Statistics about the enrichment of records with VIAFid

Phase II involved, as said before, 15,274 authority records. Of these, 15,155 were modified, while only
119 were left unchanged for lack of useful information in the VIAF cluster. For 235 of them it was
also possible to correct the VIAFid, since in the meantime VIAF data processing had modified
their cluster ids.[26]

This enrichment phase was applied not only to personal names, but to any kind of authority
record (except for subject headings):

| type | modified records |
|---|---|
| Personal names | 14,737 |
| Corporate names | 315 |
| Meeting names | 87 |
| Preferred (uniform) titles | 14 |
| Geographic names | 2 |
| Total | 15,155 |

Table 9. Modified authorities by type

9. Global statistics after enrichments

At the end of phase I and phase II of the enrichment process, figures about authority records can be
represented as in Table 10, by type and by accordance with RDA:

| type | non-RDA | RDA | type total | RDA % of type total |
|---|---|---|---|---|
| Personal names | 84,437 | 14,934 | 99,371 | 15.03% |
| Corporate names | 4,023 | 255 | 4,278 | 5.96% |
| Preferred titles | 3,835 | 273 | 4,108 | 6.65% |
| Subject headings | 2,448 | 983 | 3,431 | 28.65% |
| Meeting names | 1,767 | 68 | 1,835 | 3.71% |
| Geographic names | 27 | 17 | 44 | 38.64% |
| TOTAL | 96,537 | 16,530 | 113,067 | 14.62% |

Table 10. Counts and distribution of RDA authority records

[25] It must be a code from https://www.loc.gov/standards/sourcelist/descriptive-conventions.html .

[26] VIAF publishes the redirected identifiers once a month, in the Data Source page. The published file
http://viaf.org/viaf/data/viaf-YYYYMMDD-persist-rdf.xml.gz shows redirections within the VIAF dataset (in RDF). Even
if some VIAF services, like retrieving the cluster in XML format, automatically redirect to the current VIAFid, we think that
catalogs have to periodically update their URIs to VIAF clusters, applying the redirects listed in this file.

At the moment, 48,724 authority records have a VIAFid and they are related to 165,610 bibliographic
records out of 175,827. This means that about 94% of our bibliographic records contain at least one
link to an authority record linked to VIAF.

10. AuthorityBox 

In discussing how to improve the use of authority records at the end-user level, we concluded that
it was possible to show information from authority records as well as from external sources accessed
through persistent identifiers. We wrote AuthorityBox as an extension of our OPAC to add
information for each agent involved in a bibliographic record.

We added AuthorityBox to the detail page of bibliographic records. It has the form of an accordion,
a group of cards or boxes, each related to an agent. A special box was appended for settings, help and
information on this extension. Boxes are loaded asynchronously and are shown closed, except for the
first one.
[Image 1: screenshot of the AuthorityBox for “Einstein, Albert”, with dates (14/03/1879 - 18/04/1955),
languages, an abstract from it.wikipedia.org, gender, counts of relationships and of works in the catalog,
links to WorldCat Identities, WorldCat IDNetwork, Google Books and the permalink, followed by the
closed boxes for Born, Hedwig; Russell, Bertrand; Born, Max; Heisenberg, Werner.]

Image 1. An example of AuthorityBox, part of the bibliographic record http://catalogo.pusc.it/bib/95161

Each box (Image 1) may contain:

- information from the local authority record: dates, gender, languages, biographical or historical
  information
- links to local services: digital repository, other bibliographic records, etc.
- links to remote services: WorldCat Identities, WorldCat IDNetwork, etc.
- a thumbnail of the author from Wikidata or from a local repository
- links to Wikipedia pages
- the permalink of the authority record.

Two black arrows open or close all boxes at the same time. Settings in the last box control the
information shown in each component and the size of icons and picture. Settings are saved
in a cookie. A “Help” and an “About” section are also provided.

AuthorityBox seems to be the first of its kind. User functions involved in the use of AuthorityBox are:
search catalog, identify an item of interest, view record, see, learn, and easily navigate[27] to more
information based on authority items. It stretches the definition of a library catalog beyond an
inventory list, towards a knowledge tool.

Examples of AuthorityBox: 

http://catalogo.pusc.it/bib/182859 (1 author)
http://catalogo.pusc.it/bib/95161 (5 authors)
http://catalogo.pusc.it/bib/88801 (10 authors)

Note that the thumbnail, when available, is retrieved by the browser itself, which queries the SPARQL
endpoint of Wikidata[28] to obtain the URL and load the image.
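The request sent to the Wikidata endpoint can be sketched as follows. In AuthorityBox the query is issued by the browser; this Python version only shows how the URL might be assembled, and the function name, as well as the check that a VIAF identifier is numeric, are assumptions of the example.

```python
from urllib.parse import urlencode

ENDPOINT = "https://query.wikidata.org/sparql"


def thumbnail_query_url(viaf_id):
    """Build the URL asking Wikidata for the image (P18) of the entity
    whose VIAF ID (P214) equals viaf_id."""
    if not viaf_id.isdigit():
        # Assumption: VIAF identifiers are numeric; this also keeps
        # unexpected characters out of the SPARQL string.
        raise ValueError("VIAF identifiers are expected to be numeric")
    query = (
        'SELECT * WHERE { ?p wdt:P214 "%s" . ?p wdt:P18 ?image } LIMIT 1' % viaf_id
    )
    # urlencode percent-encodes the query so it can travel as a parameter.
    return ENDPOINT + "?" + urlencode({"format": "json", "query": query})
```

The JSON response, when a binding is present, carries the image URL in the `image` variable; an empty result simply means that no thumbnail is shown.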

11. Future developments 

11.1 Enrichment of new records 

The quality level achieved in authority records must be maintained. In our opinion, there are two ways
to reach this goal:

- enrich the new records on a regular basis, say once a day;
- add a tool to Koha in the staff cataloging interface to gather useful information from a VIAF
  cluster while cataloging a new authority record.

The first solution simply requires applying to new authority records the same script used in Phase II.
This is our current procedure. However, the second solution could ensure a better quality of the record,
since the catalogers can immediately verify the newly added information.


[27] They resemble the FRBR model. See Barbara Tillett, What is FRBR? A conceptual model for the Bibliographic Universe
(2005), p. 5.

[28] An example: https://query.wikidata.org/sparql?format=json&query= where the query parameter is the following
SPARQL query:

SELECT *
WHERE { ?p wdt:P214 "VIAFid" . ?p wdt:P18 ?image }
LIMIT 1



11.2 Update of redirected VIAFids 

A maintenance procedure could be applied to authority records, to follow the progress of the
VIAF clustering process. The redirect file from the VIAF Data Source page can be downloaded once a
month and new redirects applied to local authority records.[29]
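Applying the redirects can be sketched as a lookup that follows chains of redirections. The Python function below is an illustration under stated assumptions: the redirect file is assumed to have already been parsed into a dictionary mapping an old VIAFid to its replacement (the parsing of the RDF file is not shown), and the function name is hypothetical.

```python
def resolve_redirect(viaf_id, redirects):
    """Follow redirections until a VIAFid with no further redirect is reached.

    redirects: dict mapping an old VIAFid to the VIAFid it was redirected to.
    A set of visited ids guards against accidental cycles in the data.
    """
    seen = set()
    current = viaf_id
    while current in redirects and current not in seen:
        seen.add(current)
        current = redirects[current]
    return current
```

A local record needs an update only when `resolve_redirect` returns a value different from the VIAFid stored in its tag 024, which fits the remark that only a small part of the published redirects concerns a given catalog.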

11.3 Preferred Titles 

Persistent identifiers added to preferred (formerly, uniform) titles can increase the navigability of the
catalog and link our records to records of other major catalogs. This enrichment could introduce more
valuable relationships in our Koha catalog, towards the introduction of other LRM[30]-based
functionalities.

11.4 URIs in Wikidata 

A proposal to add an identifier of our authority records to Wikidata was filed in August 2018[31] and
accepted: property P5739[32] is now available. Using the SPARQL endpoint of Wikidata, it is possible
to select Wikidata entities that have property P5739 as well as the VIAFid in property P214.[33] This
means that we can consider enriching Wikidata entities by adding P5739, using the match between
VIAFids stored in our records and in Wikidata entities.[34]

11.5 Linked Open Data 

The next natural step could be the publication of our bibliographic and authority records in RDF
format, contributing to the semantic web at the highest level. This goal will require, first of all,
adding URIs to entities in the bibliographic records and publishing a static or a dynamic triple store.

12. Conclusions 

RDA increases the importance of authority records, thanks to the role it gives to URIs and persistent
identifiers associated with entities. Even libraries with limited resources can take on this task and
contribute to the semantic web with their catalogs. Data from important institutions, statically or
dynamically available on the net, allow the reconciliation procedures required to process existing
records and expose them for powerful services and connections, and “help our users with better access
to the rich resources we have to offer them” (Tillett 2016, 22).


[29] The file http://viaf.org/viaf/data/viaf-20190505-persist-rdf.xml.gz was about 112 megabytes in size. It contained
7,405,741 identifiers redirected to 4,748,305. Of course, only a small part of them needs to be applied to the local catalog.

[30] IFLA LRM is described in https://www.ifla.org/files/assets/cataloguing/frbr-lrm/ifla-lrm-august-2017_rev201712.pdf
and is implemented by RDA.

[31] See https://www.wikidata.org/wiki/Wikidata:Property_proposal/PUSC_ID .

[32] See https://www.wikidata.org/wiki/Property:P5739 .

[33] See https://www.wikidata.org/wiki/Property:P214 .

[34] This work was realized for the project SHARE Catalogue and described in Possemato and Forziati 2019.